Scalable Parallel Algorithms for Solving Sparse Systems of Linear Equations
Author
Abstract
We have developed a highly parallel sparse Cholesky factorization algorithm that substantially improves the state of the art in parallel direct solution of sparse linear systems, both in terms of scalability and overall performance. It is well known that dense matrix factorization scales well and can be implemented efficiently on parallel computers. However, developing efficient and scalable parallel formulations of sparse matrix factorization has been a challenge. Our new parallel sparse factorization algorithm is asymptotically as scalable as the best dense matrix factorization algorithms for a wide class of problems that includes all two- and three-dimensional finite element problems. This algorithm incurs less communication overhead than any previously known parallel formulation of sparse matrix factorization. It is equally scalable on parallel architectures based on 2-D mesh, hypercube, fat-tree, and multistage networks. In addition, it is the only known sparse factorization algorithm that can deliver speedups in proportion to an increasing number of processors while requiring almost constant memory per processor. We have successfully implemented this algorithm for Cholesky factorization on nCUBE2 and Cray T3D parallel computers. An implementation of this algorithm on the T3D delivers up to 20 GFlops on 1024 processors for medium-size structural engineering and linear programming problems. To the best of our knowledge, this is the highest performance ever obtained for sparse Cholesky factorization on any supercomputer. Numerical factorization is the most time-consuming of the four phases involved in obtaining a direct solution of a sparse system of linear equations. In addition to Cholesky factorization, we present efficient parallel algorithms for two other phases: symbolic factorization, and forward and backward substitution to solve the triangular systems resulting from sparse matrix factorization.
These algorithms are designed to work in conjunction with our sparse Cholesky factorization algorithm and incur less communication overhead than parallel sparse Cholesky factorization. Along with some recently developed parallel ordering algorithms, the algorithms presented in this thesis make it possible to develop complete scalable parallel direct solvers for sparse linear systems.
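As a point of reference for the numerical factorization and triangular-solve phases named above, the following is a minimal serial, dense sketch in Python. It is not the parallel sparse algorithm described in this abstract; the function names (`cholesky`, `forward_substitution`, `backward_substitution`, `solve_spd`) are hypothetical and chosen only for illustration. It assumes a symmetric positive definite matrix stored as a list of rows.

```python
import math


def cholesky(a):
    """Return lower-triangular L with a = L * L^T (a must be symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        diag = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(diag)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            partial = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - partial) / l[j][j]
    return l


def forward_substitution(l, b):
    """Solve L y = b for y, where L is lower triangular."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    return y


def backward_substitution(l, y):
    """Solve L^T x = y for x, reading L^T directly out of L."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def solve_spd(a, b):
    """Solve a x = b by factorization followed by two triangular solves."""
    l = cholesky(a)
    return backward_substitution(l, forward_substitution(l, b))
```

A sparse solver would precede these steps with an ordering phase and a symbolic factorization that determines the nonzero structure of L; the sketch skips both because a dense factor has no structure to predict.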
Similar resources
A high performance two dimensional scalable parallel algorithm for solving sparse triangular systems
Solving a system of equations of the form Tx = y, where T is a sparse triangular matrix, is required after the factorization phase in the direct methods of solving systems of linear equations. A few parallel formulations have been proposed recently. The common belief in parallelizing this problem is that the parallel formulation utilizing a two dimensional distribution of T is unscalable. In th...
Efficient Scalable Algorithms for Solving Dense Linear Systems with Hierarchically Semiseparable Structures
Hierarchically semiseparable (HSS) matrix techniques are emerging in constructing superfast direct solvers for both dense and sparse linear systems. Here, we develop a set of novel parallel algorithms for key HSS operations that are used for solving large linear systems. These are parallel rank-revealing QR factorization, HSS constructions with hierarchical compression, ULV HSS factorization, a...
Parallel Solution of Sparse Linear Least Squares Problems on Distributed-Memory Multiprocessors
This paper studies the solution of large-scale sparse linear least squares problems on distributed-memory multiprocessors. The method of corrected semi-normal equations is considered. New block-oriented parallel algorithms are developed for solving the related sparse triangular systems. The arithmetic and communication complexities of the new algorithms applied to regular grid problems are anal...
Solution of Large, Sparse Systems of Linear Equations in Massively Parallel
We present a general-purpose parallel iterative solver for large, sparse systems of linear equations. This solver is used in two applications, a piezoelectric crystal vibration problem and a superconductor model, that could be solved only on the largest available massively parallel machine. Results obtained on the Intel DELTA show computational rates of up to 3.25 gigaflops for these applicatio...